Abstract

A new stochastic order between two fading distributions is introduced. A fading channel dominates
another in the ergodic capacity ordering sense if the Shannon transform of the first is greater than that
of the second at all values of average signal-to-noise ratio. It is shown that some parametric fading
models such as the Nakagami-m, Rician, and Hoyt are distributions that are monotonic in their line-of-sight
parameters with respect to the ergodic capacity order. Some operations under which the ergodic
capacity order is preserved are also discussed. Through these properties of the ergodic capacity order, it
is possible to compare, under two different fading scenarios, the ergodic capacity of a composite system
involving multiple fading links with coding/decoding capabilities only at the transmitter/receiver. Such
comparisons can be made even in cases where a closed-form expression for the ergodic capacity of the
composite system is not analytically tractable. Applications to multiple-input multiple-output (MIMO)
systems are also discussed.



I. Introduction

Consider a flat fading channel with additive white Gaussian noise (AWGN), where the receiver
has perfect channel state information (CSI). The maximum achievable rate of this system, when
coding is applied across multiple independent channel realizations, is known as the ergodic
capacity, and is given by $E[\log(1 + \rho X)]$, where $\rho > 0$ represents the average signal-to-noise
power ratio (SNR) of the system, and $\rho X$ represents the instantaneous SNR random variable
(RV). This expectation is also known as the Shannon transform of $X$ [2, pp. 44], [3].

In this work, a stochastic order which can be used to compare fading channels based on the
Shannon transform of the instantaneous SNR is discussed. A fading channel is said to be better
than another in the ergodic capacity order if its corresponding ergodic capacity is larger for
all $\rho$. The proposed order is a kind of stochastic order on positive RVs. Stochastic orders in
general find applications in economics [4], reliability analysis [5], and actuarial sciences [6]. 
A comprehensive exposition of stochastic orders can be found in [7]. Previously, the stochastic 
Laplace transform (LT) order, which compares the real-valued Laplace transforms of RVs has 
been used to compare two fading distributions and applied to comparing the average error rate of 
M-QAM modulations [8]. This can be explained by the fact that error rates of some modulations 
are non-negative integral mixtures of decaying exponentials, which can also be viewed as the 
Laplace transform. It has been shown in [8] that Laplace transform ordering of instantaneous 
SNRs implies ordering of ergodic capacities, but not conversely. 

The ergodic capacity order presented in Section III of this paper is new to both stochastic 
ordering literature as well as information theory literature. Although this stochastic order was 
first introduced in [1], the current paper offers a detailed discussion of its properties, examples 
and extensions relevant to wireless communications, including the MIMO case. Further, some 
of the convergence properties of the Shannon transform are also studied. In this paper, many 
parametric fading distribution families such as the Nakagami-m, Rician and Hoyt are observed 
to have the property that the ergodic capacity is monotone with respect to the line of sight (LoS) 
parameter for each of these distributions. Consequently, the instantaneous SNRs of these fading
channels serve as examples of ergodic capacity ordered random variables. The properties of this
stochastic order are useful in obtaining comparisons of the performance of systems involving
multiple SNR RVs, as described in Section IV. For example, let $\{X_i\}_{i=1}^M$ and $\{Y_i\}_{i=1}^M$ be two sets
of fading channels such that the ergodic capacity of $X_i$ is less than that of $Y_i$, $i = 1, \ldots, M$, at
all SNR. Then, the properties of the ergodic capacity order provide the conditions under which a
composite system consisting of $\{X_i\}_{i=1}^M$ as the component fading channels has a smaller ergodic
capacity than that of a system with components $\{Y_i\}_{i=1}^M$. Such comparisons of ergodic capacities
can be made even in cases when a closed-form expression is not available. A MIMO extension
of the definition of the ergodic capacity order, which can be used to order positive semidefinite
symmetric random matrices, is given in Section V.

A. Notations and Conventions

The sets of real numbers, positive integers and complex positive semidefinite symmetric
matrices of size $n \times n$ are denoted by $\mathbb{R}$, $\mathbb{N}$, and $\mathbb{S}_+^n$ respectively, while all other sets are denoted
using script font. For a finite set $\mathcal{X}$ the cardinality is denoted by card $\mathcal{X}$, while the indicator
function is defined as $I(x \in \mathcal{X}) = 1$ if $x \in \mathcal{X}$, and 0 otherwise. For any measure $\mu(\cdot)$, $\mu(u)$
is used to represent $\mu([0,u])$. Vectors and matrices are denoted by boldface lower-case
and upper-case letters respectively. For both cases, $\|\cdot\|$ denotes the $L_2$ norm. The trace
and determinant of a matrix $\mathbf{M}$ are denoted by tr $\mathbf{M}$ and det$(\mathbf{M})$ respectively. The identity
matrix is denoted by $\mathbf{I}$. If $a_i \in \mathbb{R}$, $i = 1, \ldots, N$, then diag$(a_1, \ldots, a_N)$ is the diagonal matrix
whose $(i,i)$th element is $a_i$, $i = 1, \ldots, N$. The $i$th smallest eigenvalue of $\mathbf{A} \in \mathbb{R}^{N \times N}$ is denoted
by $\lambda_i(\mathbf{A})$, $i = 1, \ldots, N$. For a random variable $X$, $F_X(x)$ and $f_X(x)$ denote the cumulative
distribution function (CDF) and the probability density function (PDF) respectively. $E[g(X)]$ is
used to denote the expectation of the function $g(\cdot)$ over the PDF of $X$. All logarithms are natural
logarithms. We write $f_1(x) = O(f_2(x))$, $x \to a$ to indicate that $\limsup_{x \to a} (f_1(x)/f_2(x)) < \infty$.

II. Mathematical Preliminaries 

A. Completely Monotone Functions 

A function $g : (0, \infty) \to \mathbb{R}$ is completely monotone (c.m.) if and only if it has derivatives of
all orders which satisfy

$$(-1)^n \frac{d^n g(x)}{dx^n} \geq 0 \tag{1}$$

for all $x > 0$ and $n \in \mathbb{N} \cup \{0\}$, where the derivative of order $n = 0$ is defined as $g(x)$ itself.
The celebrated Bernstein's theorem [9] asserts that $g : (0, \infty) \to \mathbb{R}$ is c.m. if and only if it can
be written as a mixture of decaying exponentials:

$$g(x) = \int_{[0,\infty)} \exp(-ux)\, \mu(du), \tag{2}$$

which is a Lebesgue integral with respect to a positive measure $\mu$ on $[0,\infty)$. It is straightforward
to verify that c.m. functions are positive, decreasing and convex, and that positive linear
combinations of c.m. functions are also c.m. [9].

B. Stieltjes Functions

Stieltjes functions are a subclass of completely monotone functions, denoted by $\mathcal{S}$. A function
$g : (0, \infty) \to [0, \infty)$ belongs to $\mathcal{S}$ if and only if it can be written in the form

$$g(x) = \frac{a}{x} + b + \int_{(0,\infty)} (x + u)^{-1}\, \mu(du), \tag{3}$$

where $a, b \geq 0$, and $\mu$ is a non-negative measure on $(0, \infty)$ which satisfies the convergence
condition $\int_{(0,\infty)} (1 + u)^{-1}\, \mu(du) < \infty$. It is easy to show that any Stieltjes function is also a
double Laplace transform of a non-negative function. A necessary and sufficient condition for
$x \mapsto g(x) \in \mathcal{S}$ is that $x \mapsto (g(x^{-1}))^{-1}$ also belongs to $\mathcal{S}$ [9, p. 66].



C. Bernstein Functions 

A function $g : (0, \infty) \to \mathbb{R}$ is a Bernstein function if and only if $g(x) \geq 0$, $\forall x > 0$, and
$dg(x)/dx$ is c.m. Equivalently, $g(x)$ admits the representation [9, p. 15]

$$g(x) = a + bx + \int_{(0,\infty)} (1 - \exp(-xu))\, \mu(du), \tag{4}$$

for some $a, b \geq 0$, where $\mu$ is a non-negative measure on $(0, \infty)$ satisfying
$\int_{(0,1)} u\, \mu(du) + \int_{[1,\infty)} \mu(du) < \infty$. The set of all Bernstein functions is denoted by $\mathcal{BF}$.

An important property is that the set $\mathcal{BF}$ is closed under positive linear combinations: if
$g_i \in \mathcal{BF}$ and $a_i > 0$, $i = 1, \ldots, N$, then $\sum_{i=1}^N a_i g_i \in \mathcal{BF}$. Some examples of Bernstein functions
are $g(x) = x^{\alpha}$ for $0 < \alpha < 1$, $g(x) = x/(1 + x)$ and $g(x) = \log(1 + x)$. The representation of
the capacity function $\log(1 + x)$ in the form (4) is known as Frullani's integral [10, p. 6], and
is given by

$$\log(1 + x) = \int_0^\infty \left(1 - e^{-sx}\right) \frac{e^{-s}}{s}\, ds. \tag{5}$$
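
Frullani's integral can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `frullani_integral`, the truncation point `upper` and the step count are illustrative assumptions, and a plain midpoint rule is used because the integrand stays finite (it tends to $x$) as $s \to 0$.

```python
import math

def frullani_integral(x, upper=60.0, steps=200_000):
    """Approximate the integral of (1 - exp(-s*x)) * exp(-s) / s over s in (0, upper).

    A midpoint rule is used so the integrand is never evaluated at s = 0;
    the tail beyond `upper` is negligible because of the exp(-s) factor.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        # -expm1(-s*x) equals 1 - exp(-s*x) without cancellation for small s*x
        total += -math.expm1(-s * x) * math.exp(-s) / s
    return total * h

for x in (0.1, 1.0, 10.0):
    print(x, frullani_integral(x), math.log1p(x))
```

The two printed columns should agree to several decimal places, which is consistent with $\log(1+x)$ being a mixture of the functions $1 - e^{-sx}$ with weight $e^{-s}/s$.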

D. Thorin-Bernstein Functions 

A Bernstein function $g$ is called a Thorin-Bernstein function [9, pp. 73-79] if it admits the
representation given by (4), where $s\mu(s)$ is c.m. The family of all Thorin-Bernstein functions is
denoted by $\mathcal{TBF}$. A necessary and sufficient condition for $g : (0, \infty) \to (0, \infty)$ to be in $\mathcal{TBF}$
is that $g$ can be represented as follows [9, p. 73]:

$$g(x) = a + bx + \int_{(0,\infty)} \log\left(1 + \frac{x}{s}\right) \mu(ds), \tag{6}$$

for some $a, b \geq 0$, where $\mu$ is a positive measure on $(0, \infty)$ which satisfies the convergence
condition $\int_{(0,1)} |\log s|\, \mu(ds) + \int_{[1,\infty)} s^{-1}\, \mu(ds) < \infty$. We refer to any $g_2 \in \mathcal{TBF}$ which satisfies the
property that $g_1(g_2(\cdot)) \in \mathcal{TBF}$ whenever $g_1 \in \mathcal{TBF}$ as a composable Thorin-Bernstein function
(we denote the set of all such functions by $\mathcal{CTBF}$). A sufficient condition for any $g_2$ to belong
to $\mathcal{CTBF}$ is that $g_2 \in \mathcal{TBF}$ and $(dg_2(x)/dx)/g_2(x) \in \mathcal{S}$. Functions belonging to the class $\mathcal{TBF}$
are of particular relevance to this paper, since the Shannon capacity function $C(x) := \log(1 + x)$
not only belongs to $\mathcal{BF}$, but also belongs to $\mathcal{TBF}$, as seen from (5) and (6).

It is useful to define a multivariate extension of a Thorin-Bernstein function. A function
$g : \mathbb{R}_+^m \to \mathbb{R}$ belongs to $\mathcal{TBF}_m$ if $g(x_1, \ldots, x_m)$ is a Thorin-Bernstein function in each argument,
when all other arguments are treated as constants. Further, if $g$ is composable in each variable
when all other variables are fixed, then $g$ is said to belong to the set $\mathcal{CTBF}_m$. An example of a
function in $\mathcal{CTBF}_M$ can be verified to be $g(x_1, \ldots, x_M) = \sum_{i=1}^M a_i x_i$, $a_i > 0$, $i = 1, \ldots, M$.

E. Matrix Functions

Let $\phi : \mathbb{R} \to \mathbb{R}$. If $\mathbf{D} = \text{diag}(\lambda_1, \ldots, \lambda_N)$, we define $\phi(\mathbf{D}) = \text{diag}(\phi(\lambda_1), \ldots, \phi(\lambda_N))$. If
$\mathbf{A}$ is an $N \times N$ Hermitian matrix, so that $\mathbf{A} = \mathbf{U}\, \text{diag}(\lambda_1(\mathbf{A}), \ldots, \lambda_N(\mathbf{A}))\, \mathbf{U}^H$, where $\mathbf{U}$ is a unitary matrix, then we
define $\phi(\mathbf{A}) = \mathbf{U} \phi(\mathbf{D}) \mathbf{U}^H$. In this way, $\phi(\mathbf{A})$ can be defined for all Hermitian matrices of any
order [11]. In this work, the scalar function and its matrix extension are denoted using the same
symbol, and the argument of the function defines the specific context. Matrix functions find
applications in Section V.

F. Integral Stochastic Orders

Let $\mathcal{G}$ denote a class of real-valued functions $g : \mathbb{R}_+ \to \mathbb{R}$, and let $X$ and $Y$ be random variables
(RVs). We define the integral stochastic order with respect to $\mathcal{G}$ as [6]:

$$X \leq_{\mathcal{G}} Y \iff E[g(X)] \leq E[g(Y)], \quad \forall g \in \mathcal{G}. \tag{7}$$

In this case, $\mathcal{G}$ is known as a generator of the order $\leq_{\mathcal{G}}$. We now give an example of an integral
stochastic order relevant to this paper, by specifying the corresponding generator set of functions
$\mathcal{G}$.

1) Laplace Transform Order: This partial order compares random variables based on their
Laplace transforms. Here, $\mathcal{G} = \{g(x) : g(x) = -\exp(-\rho x),\ \rho > 0\}$, so that $X \leq_{Lt} Y$ if and
only if

$$E[\exp(-\rho Y)] \leq E[\exp(-\rho X)], \quad \forall \rho > 0. \tag{8}$$

One useful property of LT ordered random variables is that for all c.m. functions $g$, we have

$$E[g(Y)] \leq E[g(X)]. \tag{9}$$

In other words, the generator $\mathcal{G}$ can be enlarged to the set of all c.m. functions without changing
the stochastic order [6]. Further, whenever $g \in \mathcal{BF}$, (9) holds with a reversal in the inequality.
In a wireless communications context, let $\rho > 0$ be the average SNR, and let $\rho X$, $\rho Y$ represent the
instantaneous SNRs of two fading distributions. If $g(x)$ corresponds to the instantaneous symbol
error rate $P_e(\rho x)$ of a modulation scheme with c.m. error rate function, then (9) can be used
to obtain comparisons of averages of symbol error rates over pairs of fading channels, even in
cases where a closed-form expression for the same is intractable.
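
As an illustration of condition (8), the sketch below compares two unit-mean Gamma distributed fading powers, whose Laplace transform $(1 + \rho/m)^{-m}$ is available in closed form. The helper names and the choice of shape parameters are assumptions made for this example only, and checking a finite grid of $\rho$ values is of course a sanity check rather than a proof.

```python
import math

def gamma_laplace(rho, m):
    """Laplace transform E[exp(-rho*X)] of a unit-mean Gamma RV with shape m and scale 1/m."""
    return (1.0 + rho / m) ** (-m)

def lt_dominated(m_x, m_y, rhos):
    """True if E[exp(-rho*Y)] <= E[exp(-rho*X)] at every rho on the grid, as required by (8)."""
    return all(gamma_laplace(r, m_y) <= gamma_laplace(r, m_x) for r in rhos)

# Logarithmic grid of rho values from 1e-3 to 1e3
rhos = [10 ** (k / 10) for k in range(-30, 31)]

print(lt_dominated(1.0, 3.0, rhos))  # X with shape 1, Y with shape 3
print(lt_dominated(3.0, 1.0, rhos))  # roles swapped
```

On this grid the first check succeeds and the swapped one fails, in line with the Gamma family being LT ordered in its shape parameter.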

Another useful result for Laplace transform ordered RVs is that, if $X_1, \ldots, X_M$ are independent
and $Y_1, \ldots, Y_M$ are independent, and if $X_m \leq_{Lt} Y_m$, $m = 1, \ldots, M$, then $g(X_1, \ldots, X_M) \leq_{Lt}
g(Y_1, \ldots, Y_M)$, whenever $g(x_1, \ldots, x_M)$ is a Bernstein function of $x_m$ when all other arguments
of $g$ are viewed as constants [9].

G. Shannon Transform 

In what follows, we formally describe the Shannon transform, which is the basis of the
proposed stochastic order in this paper. The Shannon transform of a non-negative random variable
$X$ is defined as [2, pp. 44]:

$$\overline{C}^{(X)}(\rho) := E[\log(1 + \rho X)], \quad \rho > 0. \tag{10}$$

Two new representations of $\overline{C}^{(X)}(\rho)$, which are useful in this paper, are now obtained. Using
(5), it is easy to show that (10) can be represented as a Laplace transform, given by

$$\overline{C}^{(X)}(\rho) = \int_0^\infty \exp(-u/\rho)\, \frac{1 - \phi_X(u)}{u}\, du \tag{11}$$

for $\rho > 0$, where $\phi_X(u) := E[\exp(-uX)]$, $u \geq 0$. Using (2) with (11), it is immediate that
$\overline{C}^{(X)}(\rho)$ is a completely monotone function of $1/\rho$. A second representation of $\overline{C}^{(X)}(\rho)$, which
can be derived from (11), shows that $\overline{C}^{(X)}(\rho)$ is also the Stieltjes transform [12, p. 325] of the
complementary CDF of $X$, when evaluated at $1/\rho$:

$$\overline{C}^{(X)}(\rho) = \int_0^\infty \frac{1 - F_X(u)}{u + 1/\rho}\, du, \tag{12}$$

where $\rho > 0$. Representation (12) is used in proving some properties of the ergodic capacity
order discussed in Section III-B. Additionally, (12) permits us to comment on the convergence
of $\overline{C}^{(X)}(\rho)$:

Proposition 1. If $\overline{C}^{(X)}(\rho)$ exists for some $\rho \in (0, \infty)$, then $\overline{C}^{(X)}(\rho)$ exists for every $\rho \in (0, \infty)$.

Proof: From (12), it is seen that $\overline{C}^{(X)}(\rho)$ is the Stieltjes transform of a real-valued function.
If the Stieltjes transform of a function exists at any point on $\mathbb{R}_+$, then it exists at all points on
$\mathbb{R}_+$ [12, p. 326]. This completes the proof. ■
We now provide examples of random variables for which the ergodic capacity is finite for
$\rho < \infty$ using the following proposition:

Proposition 2. Let $F_X(\cdot)$ denote the cumulative distribution function of an RV $X$. If for some
$\delta \in (0, 1]$, $\int_0^t (1 - F_X(u))\, du = O(t^{1-\delta})$, $t \to \infty$, then $\overline{C}^{(X)}(\rho) < \infty$.

Proof: First, observe that $\int_0^\infty (s + t)^{-1}\, d\alpha(t)$ exists if $\alpha(t) = O(t^{1-\delta})$, $t \to \infty$, for some $\delta > 0$
[12, p. 330 (Theorem 3b)]. The proposition then follows by letting $\alpha(t) = \int_0^t (1 - F_X(u))\, du$.
This completes the proof. ■
In Proposition 2, the case of $\delta = 1$ is equivalent to the condition that the mean of $X$ is finite.
It is therefore straightforward to see that the ergodic capacity of fading distributions such as
Nakagami-m and Rician is finite at all finite SNR, since these distributions have finite average
power. We now proceed to define a stochastic order for comparing fading distributions based on
the Shannon transform.

III. The Ergodic Capacity Order

Recall that the ergodic capacity of a single-input single-output (SISO) system is given by
$E[\log(1 + \rho X)]$, where $X$ is the square of the amplitude of the complex fading gain, and is
defined as the instantaneous fading power of the channel. It is straightforward to see through an
application of Jensen's inequality that the AWGN channel (with no fading) outperforms every
fading distribution with the same average channel power, in terms of the ergodic capacity at all
SNR. However, given two fading distributions, it is not trivial to compare them based on the
ergodic capacity, as obtaining a closed-form expression for the ergodic capacity of many fading 
channels is analytically intractable. Motivated by this, we propose a stochastic ordering method, 
which can be used to compare the ergodic capacity of two different fading channels. 
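
The Jensen's inequality remark can be illustrated for Rayleigh fading, where the fading power is exponentially distributed with unit mean. The sketch below is an illustration under that assumption: it evaluates the ergodic capacity once directly from the definition and once through the Stieltjes-transform form (the integral of the complementary CDF $e^{-u}$ against $1/(u + 1/\rho)$), and compares both with the AWGN value $\log(1 + \rho)$. The function names, the truncation point and the Simpson step count are choices made for this example.

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def capacity_direct(rho, upper=60.0):
    """E[log(1 + rho*X)] for X ~ Exp(1), integrating against the PDF exp(-x)."""
    return simpson(lambda x: math.log1p(rho * x) * math.exp(-x), 0.0, upper)

def capacity_stieltjes(rho, upper=60.0):
    """Same quantity as the Stieltjes transform of the complementary CDF exp(-u) at 1/rho."""
    return simpson(lambda u: math.exp(-u) / (u + 1.0 / rho), 0.0, upper)

for rho in (0.1, 1.0, 10.0):
    print(rho, capacity_direct(rho), capacity_stieltjes(rho), math.log1p(rho))
```

The first two columns should coincide up to quadrature error, and both should stay below the AWGN capacity in the last column (all values in nats).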

A. Definition 

Definition 1. If $X$ and $Y$ are arbitrary non-negative RVs, then $X$ is said to be dominated by
$Y$ in the ergodic capacity order (i.e. $X \leq_c Y$) if and only if $\overline{C}^{(X)}(\rho) \leq \overline{C}^{(Y)}(\rho)$, $\forall \rho > 0$.

For this stochastic order, the generator is chosen as $\mathcal{G} = \{g(x) : g(x) = \log(1 + \rho x),\ \rho > 0\}$.
In the situation that both Shannon transforms are infinite, we still say that $X \leq_c Y$. Distributions of
interest for which the ergodic capacity is finite at all finite SNR can be determined using either
Proposition 1 or Proposition 2. Next, some useful properties of the capacity order and a few
examples of ergodic capacity ordered RVs are discussed.

B. Properties 

The following properties hold for non-negative RVs. 

S1: $X \leq_c Y \iff E[g(X)] \leq E[g(Y)]$, $\forall g \in \mathcal{TBF}$.
S2: $X \leq_c Y \iff g(X) \leq_c g(Y)$, $\forall g \in \mathcal{CTBF}$.
S3: $X \leq_{Lt} Y \implies X \leq_c Y$.
S4: Let $X_1, \ldots, X_M$ be independent and $Y_1, \ldots, Y_M$ be independent. If $X_m \leq_c Y_m$, $m = 1, \ldots, M$,
then $g(X_1, \ldots, X_M) \leq_c g(Y_1, \ldots, Y_M)$, $\forall g \in \mathcal{CTBF}_M$.
S5: If $X \leq_c Y$ and $Y \leq_c Z$, then $X \leq_c Z$.
S6: If $X \leq_c Y$ and $Y \leq_c X$, then $F_X(\cdot) = F_Y(\cdot)$ a.e.

The proofs of these properties can be found in Appendix A. A straightforward implication
of Property S1 is that if $X \leq_c Y$, then $E[X] \leq E[Y]$, since $g(x) = x$ is a Thorin-Bernstein
function. In other words, if one fading channel has a higher ergodic capacity than another at all
SNR, then it is necessary that the average fading power of the first channel is no smaller than that
of the second. We further observe that the counterpart of Property S4 for the Laplace transform order
holds when $X_m \leq_{Lt} Y_m$, $m = 1, \ldots, M$, with $g$ belonging to $\mathcal{BF}$ in each variable, while viewing all the
other variables as constants [7]. Properties
S5 and S6 together constitute the definition of a partial order, and consequently $\leq_c$ is a partial
order on non-negative RVs.

Interpreting $\rho X$ and $\rho Y$ as the instantaneous SNRs of two different fading channels, Properties
S1-S6 are useful in obtaining the conditions under which the ergodic capacity of a composite
system with coding/decoding capabilities only at the transmitter/receiver under the channel $Y$ is
greater than that under $X$ at all SNR. Although Property S3 shows that every pair of Laplace
transform ordered random variables also obeys the ergodic capacity order, the converse is not
true in general. A counterexample can be found in [1], [8]. Thus, it is possible that the average
symbol error rate of differential binary phase shift keying modulation in channel $X$ is less than
that in $Y$ at high SNR, while the situation reverses when the capacity achieving code is applied
on both channels. Interpreting the ergodic capacity as what is achievable by coding over an i.i.d.
time-extension of the channel, we reach the conclusion that even though $Y$ offers more diversity
than $X$ for an uncoded system, the i.i.d. extension of $X$ lends itself to more diversity than that
of $Y$. To put it more simply, at high SNR, it is possible for one fading channel to be superior to
another in terms of error rates in the absence of coding, while being inferior when the capacity
achieving code is employed over both channels.

C. Examples 

Next, we give examples of pairs of RVs $X$, $Y$ relevant to wireless communications, for
which $X \leq_c Y$ holds. In general, establishing capacity ordering directly from its definition is often
inconclusive, since the corresponding integrals are intractable. Fortunately, using Property S3, it
is possible to provide examples of pairs of RVs which obey capacity ordering. In what follows,
examples of parametric fading distributions which obey the ergodic capacity order are given.
These distributions are also known to satisfy the Laplace transform order [8].

1) Nakagami Fading: In the Nakagami-m fading model, the envelope $\sqrt{X}$ is Nakagami
distributed, and the instantaneous fading power $X$ is Gamma distributed, with PDF given
by

$$f_X(x) = \frac{m^m}{\Gamma(m)}\, x^{m-1} \exp(-mx), \quad x \geq 0, \tag{13}$$

where $m > 0$ is the line of sight parameter, and $\Gamma(r) := \int_0^\infty t^{r-1} \exp(-t)\, dt$ is the gamma
function. If $X \sim \text{Gamma}(m_X)$ and $Y \sim \text{Gamma}(m_Y)$ with $m_X \leq m_Y$, then $X \leq_c Y$.
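
The monotonicity in $m$ can be observed numerically. The following sketch is an illustration only: it takes the expectation inside Frullani's integral (5), which replaces $X$ by its Laplace transform $(1 + u/m)^{-m}$ for the unit-mean Gamma power in (13), and evaluates the resulting one-dimensional integral with a midpoint rule. The function name `gamma_capacity`, the grid of $m$ values and the quadrature settings are assumptions of this example.

```python
import math

def gamma_capacity(rho, m, upper=60.0, steps=100_000):
    """Ergodic capacity E[log(1 + rho*X)] in nats for X ~ Gamma(shape m, scale 1/m).

    Computed as the integral over s of (1 - phi(rho*s)) * exp(-s) / s, where
    phi(u) = (1 + u/m)**(-m) is the Laplace transform of X (midpoint rule in s).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        phi = (1.0 + rho * s / m) ** (-m)
        total += (1.0 - phi) * math.exp(-s) / s
    return total * h

ms = (0.5, 1.0, 2.0, 4.0)
for rho_db in (0, 10, 20):
    rho = 10 ** (rho_db / 10)
    caps = [gamma_capacity(rho, m) for m in ms]
    print(rho_db, [round(c, 4) for c in caps], round(math.log1p(rho), 4))
```

At each SNR the printed capacities should increase with $m$ and remain below the AWGN value in the last column, even though every distribution in the family has the same unit mean.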

2) Rician Fading: In this case, the envelope of the fading, i.e., $\sqrt{X}$,
is Rice distributed with line of sight parameter $K$, and the corresponding instantaneous fading
power distribution is given by

$$f_X(x) = (K + 1) \exp\left[-(K + 1)x - K\right] I_0\left(2\sqrt{K(K + 1)x}\right), \tag{14}$$

where $I_0(t) := \sum_{m=0}^\infty (t/2)^{2m} / (m!\, \Gamma(m + 1))$ is the modified Bessel function of the first kind
of order zero. If the distributions of $X$ and $Y$ have parameters $K_X$ and $K_Y$ respectively, with
$K_X \leq K_Y$, then $X \leq_c Y$.

3) Hoyt Fading: In the Nakagami-q (Hoyt) fading model, the envelope of the fading RV,
given by $\sqrt{X}$, is Hoyt distributed, and the density of the (unit mean) instantaneous fading power
is given by

$$f_X(x) = a \exp(-a^2 x)\, I_0(bx), \tag{15}$$

where $a = (1 + q^2)/(2q)$ and $b = (1 - q^4)/(4q^2)$. If $X$ and $Y$ have parameters $q_X$ and $q_Y$ respectively,
where $q_X \leq q_Y$, then $X \leq_c Y$.

For the cases of Nakagami, Rician and Hoyt fading, the increase in ergodic capacity with
increase in the LoS parameter of the distribution is not due to an increase in the average fading
power, since $E[X] = E[Y]$, which is independent of the LoS parameter.

In what follows, we show that ergodic capacity ordering of a given SISO system under two 
different fading channels can be used to make meaningful conclusions when a number of such 
systems are combined to form a system involving multiple random variables. 

IV. Systems Involving Multiple Random Variables 

In order to illustrate the applicability of the capacity order to compare the performance of
systems, we provide examples of composite systems where capacity ordering of component SISO
systems can be used to conclude the capacity ordering of the system, and also some applications
where this is not necessarily the case. Such generic conclusions can be made even when closed-form
expressions for the ergodic capacity are not available. Throughout, we assume that the
receiver has a perfect estimate of the instantaneous fading power, while the transmitter does not
possess any such information.

A. Diversity Combining Systems 

As examples of systems involving multiple fading links, we first consider diversity combining
schemes such as MRC and EGC using $M$ receive antennas, for which we aim to compare the
ergodic capacity under two different fading scenarios. Using the properties of the ergodic capacity
order, we now show that diversity combining systems formed using a better set of components
yield a system with a higher ergodic capacity, for the two schemes considered.

1) Maximum Ratio Combining: Conditioned on the instantaneous fading power $X_m = x_m$,
$m = 1, \ldots, M$, the fading power after combining is given by

$$g_{MRC}(x_1, \ldots, x_M) = \sum_{m=1}^M x_m. \tag{16}$$

The ergodic capacity corresponding to this combining scheme is given by

$$\overline{C}^{(X)}_{MRC}(\rho) = E\left[\log\left(1 + \rho\, g_{MRC}(X_1, \ldots, X_M)\right)\right]. \tag{17}$$

In order to compare the ergodic capacity of MRC in two different fading environments
characterized by instantaneous fading powers $(X_1, \ldots, X_M)$ and $(Y_1, \ldots, Y_M)$, we first verify that
$g_{MRC}(\cdot)$ is a composable Thorin-Bernstein function. Then, we use Property S4 to conclude
$\overline{C}^{(X)}_{MRC}(\rho) \leq \overline{C}^{(Y)}_{MRC}(\rho)$, $\forall \rho > 0$, when $X_m \leq_c Y_m$, $m = 1, \ldots, M$.

In order to check if $g_{MRC}(\cdot) \in \mathcal{CTBF}$, we first show that $g_{MRC} \in \mathcal{TBF}$. To this end, treat $x_1$ in
$g_{MRC}(\cdot)$ as the variable, while treating other arguments as constants, to get $g_{MRC}(x_1; x_2, \ldots, x_M) =
x_1 + k$, where $k = \sum_{m=2}^M x_m$. Clearly, $g_{MRC}(x_1; x_2, \ldots, x_M) \in \mathcal{TBF}$, since it satisfies the
conditions of (6) with $a = k$, $b = 1$ and $\mu(u) = 0$. Thus, by definition, $g_{MRC} \in \mathcal{TBF}$. Next,
$g_{MRC}$ is shown to belong to $\mathcal{CTBF}$: $g'_{MRC}(x_1; x_2, \ldots, x_M)/g_{MRC}(x_1; x_2, \ldots, x_M) = (x_1 + k)^{-1}$,
which is a Stieltjes function, as it satisfies (3) with $a = 0$, $b = 0$, and $\mu$ equal to a unit point mass at $k$, i.e. $\mu(s) = \delta(s - k)$.

Now, assuming $X_m \leq_c Y_m$, $m = 1, \ldots, M$, we have from Property S4 $g_{MRC}(X_1, \ldots, X_M) \leq_c
g_{MRC}(Y_1, \ldots, Y_M)$, which implies $\overline{C}^{(X)}_{MRC}(\rho) \leq \overline{C}^{(Y)}_{MRC}(\rho)$, $\forall \rho > 0$. Thus, if $Y_m$ dominates $X_m$
in the capacity order for $m = 1, \ldots, M$, then the MRC system with fading links given by
$Y_1, \ldots, Y_M$ will have a higher ergodic capacity than that with $X_1, \ldots, X_M$ at all SNR.
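
The MRC conclusion can be illustrated for i.i.d. Nakagami-m branches. This is a sketch under stated assumptions: every branch power is unit-mean Gamma with the same shape $m$, so the combined power in (16) has Laplace transform $(1 + u/m)^{-mM}$, and the capacity (17) follows by taking the expectation inside Frullani's integral (5). The function name, the parameter name `branches` and the chosen shapes are illustrative.

```python
import math

def mrc_capacity(rho, m, branches, upper=60.0, steps=100_000):
    """E[log(1 + rho * sum_m X_m)] in nats for i.i.d. X_m ~ Gamma(shape m, scale 1/m).

    The sum of independent branches has Laplace transform (1 + u/m)**(-m*branches);
    the capacity is the integral over s of (1 - phi_sum(rho*s)) * exp(-s) / s.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        phi_sum = (1.0 + rho * s / m) ** (-m * branches)
        total += (1.0 - phi_sum) * math.exp(-s) / s
    return total * h

for rho_db in (0, 10):
    rho = 10 ** (rho_db / 10)
    c_x = mrc_capacity(rho, m=1.0, branches=3)  # Rayleigh branches (m = 1)
    c_y = mrc_capacity(rho, m=2.5, branches=3)  # branches with a larger m
    print(rho_db, round(c_x, 4), round(c_y, 4))
```

Since each branch with the larger $m$ dominates the corresponding Rayleigh branch in the capacity order, the second printed capacity should exceed the first at both SNR values.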

2) Equal Gain Combining: For the case of equal gain combining, the ergodic capacity is
given by

$$\overline{C}^{(X)}_{EGC}(\rho) = E\left[\log\left(1 + \rho\, g_{EGC}(X_1, \ldots, X_M)\right)\right], \tag{18}$$

where $g_{EGC}(\cdot)$ represents the combined instantaneous fading power, and is given by

$$g_{EGC}(x_1, \ldots, x_M) = \left(\sum_{m=1}^M \sqrt{x_m}\right)^2. \tag{19}$$

In order to verify that $g_{EGC} \in \mathcal{TBF}$, treat $x_1$ as the variable and all the other arguments of
$g_{EGC}$ as constants, so that $g_{EGC}(x_1; x_2, \ldots, x_M) = x_1 + 2\sqrt{x_1}\,k + k^2$, where $k = \sum_{m=2}^M \sqrt{x_m}$.
Clearly, $g_{EGC}(x_1; x_2, \ldots, x_M) \in \mathcal{TBF}$, since it satisfies the conditions of (6) with $a = k^2$,
$b = 1$ and $\mu(s) = k s^{-1/2}/\pi$. Repeating the same argument with all other parameters of $g_{EGC}(\cdot)$,
we conclude that $g_{EGC} \in \mathcal{TBF}$. In addition, $g_{EGC}(\cdot)$ is composable, as seen from the following
arguments. Let $h(x) := g'_{EGC}(x; k)/g_{EGC}(x; k) = (x + k\sqrt{x})^{-1}$. To show that $h \in \mathcal{S}$, observe that
$(h(x^{-1}))^{-1} = x^{-1} + k x^{-1/2}$ is a Stieltjes function, since any function of the form $x^{\alpha - 1}$, $0 \leq \alpha < 1$,
is a Stieltjes function [9, p. 13], and positive linear combinations of Stieltjes functions also yield
a Stieltjes function. To complete the argument, since $(h(x^{-1}))^{-1} \in \mathcal{S}$, $h(x)$ must also belong to
$\mathcal{S}$ [9, p. 66]. Consequently, $g_{EGC}(\cdot) \in \mathcal{CTBF}$.

Now, following similar arguments as in the MRC case, we infer that if a collection of SISO
systems with higher ergodic capacity is combined to form an EGC system, then the composite
EGC system will have higher overall ergodic capacity.
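
For EGC no simple closed form of the combined power distribution is available, so a Monte Carlo estimate is a natural way to observe the ordering. The sketch below assumes i.i.d. unit-mean Gamma branch powers and takes the combined power to be the squared sum of the branch amplitudes $\sqrt{x_m}$, consistent with the single-variable expansion $x_1 + 2\sqrt{x_1}k + k^2$ above. The function name, sample size and seed are arbitrary choices, and the estimate carries sampling error.

```python
import math
import random

def egc_capacity_mc(rho, m, branches, samples=20_000, seed=0):
    """Monte Carlo estimate of E[log(1 + rho * (sum_m sqrt(X_m))**2)] in nats,
    with i.i.d. X_m ~ Gamma(shape m, scale 1/m), i.e. unit-mean branch powers."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        # Sum of branch amplitudes; random.gammavariate takes (shape, scale)
        amp = sum(math.sqrt(rng.gammavariate(m, 1.0 / m)) for _ in range(branches))
        acc += math.log1p(rho * amp * amp)
    return acc / samples

rho = 10.0
print(egc_capacity_mc(rho, m=1.0, branches=2))  # Rayleigh branches
print(egc_capacity_mc(rho, m=4.0, branches=2))  # branches with a larger m
```

With these settings the estimate for the larger $m$ should come out clearly above the Rayleigh one, by a margin much larger than the Monte Carlo standard error.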

B. Multi-Hop Amplify-Forward Relay System 

We now turn our attention to multi-hop amplify-forward (AF) relay systems. For this system,
even though a set of better channels (in the ergodic capacity order) is combined to form the
relay, it need not necessarily result in a relay with a higher ergodic capacity, as described below.
Consider an $M$-hop AF relay (refer to Figure 1), which is described as follows:

$$y_m = \sqrt{a_m \rho}\, h_m s + v_m, \quad m = 1, \ldots, M, \tag{20}$$

where $y_m$, $m = 1, \ldots, M - 1$ is the received signal at the $m$th half-duplex variable-gain relay,
$h_m$ is the complex ergodic fading on the $m$th hop, which is i.i.d. across time, $s$ is the transmitted
symbol, $\rho$ is the average SNR, $v_m$ is the AWGN which is i.i.d. across hops as well as time,
and $y_M$ is the signal received at the destination. It is assumed that coding/decoding capabilities
are provided to the transmitter/receiver alone, and the $m$th relay, which has knowledge of $X_m$,
merely amplifies and forwards the signal using a variable gain given by $a_m = \rho/(\rho X_m + 1)$ [13],
where $X_m := |h_m|^2$ is the instantaneous fading power of the $m$th hop, for $m = 1, \ldots, M - 1$.
In this case, the ergodic capacity is given by

$$\overline{C}^{(X)}_{MH-AF}(\rho) = E\left[\log\left(1 + \rho\, g_{MH-AF}(X_1, \ldots, X_M)\right)\right], \tag{21}$$

where g MH _ AF (x 1: . . . , x M ) ■= (E[if=i [C 1 + x m)\ ~ Exact expressions for the ergodic 
capacity in arbitrary fading channels are intractable, even for the two-hop case. Previously, 
the ergodic capacity of such a relay in fading channels has been obtained as an infinite series 
in [14]. 

In order to compare the performance of the AF relay in two different fading scenarios, let
$X_m$ and $Y_m$ denote the instantaneous fading power of the $m$th link of the first and second fading
channels respectively, for $m = 1, \ldots, M$. Further, assume that $X_m \leq_c Y_m$, $m = 1, \ldots, M$. In
order to use Property S4 to obtain conclusions about the comparative ergodic capacities, we first
check if $g_{MH-AF} \in \mathcal{CTBF}$. To this end, we first check if $g_{MH-AF} \in \mathcal{TBF}$. Treating $x_1$ as the
variable and all other arguments of $g_{MH-AF}(\cdot)$ as constants, we get $g_{MH-AF}(x_1; x_2, \ldots, x_M) =
k x_1/(x_1 + k)$, where $k = g_{MH-AF}(x_2, \ldots, x_M)$. From [9, pp. 218], it is seen that $g_{MH-AF} \notin \mathcal{TBF}$.
Therefore, $g_{MH-AF} \notin \mathcal{CTBF}$, and Property S4 cannot be used to conclude anything about the ergodic
capacity of $X$ versus that of $Y$. Nevertheless, by straightforward differentiation with respect
to $x_1$, it is observed that $g_{MH-AF}(\cdot)$ is a Bernstein function of $x_1$. Therefore, $g_{MH-AF} \in \mathcal{BF}$, but
$g_{MH-AF} \notin \mathcal{TBF}$. As a result, if the instantaneous fading powers also satisfy the stronger criterion
$X_m \leq_{Lt} Y_m$, $m = 1, \ldots, M$, then we get $g_{MH-AF}(X_1, \ldots, X_M) \leq_{Lt} g_{MH-AF}(Y_1, \ldots, Y_M)$, and
therefore $g_{MH-AF}(X_1, \ldots, X_M) \leq_c g_{MH-AF}(Y_1, \ldots, Y_M)$. However, if $X_m \leq_c Y_m$, while $X_m \not\leq_{Lt}
Y_m$, $m = 1, \ldots, M$, then we cannot conclude $g_{MH-AF}(X_1, \ldots, X_M) \leq_c g_{MH-AF}(Y_1, \ldots, Y_M)$. The
latter is observed in the case of an interference dominated channel, where the instantaneous fading
power to interference power ratios $X_m$ are independent and Pareto distributed with parameter $\beta_X$:

Fx - ^ = YT^ ,X > > ' (22) 

and $Y_m$ similarly with parameter $\beta_Y$, where $\beta_X < \beta_Y$. For example, Fig. 2 illustrates the
numerically evaluated ergodic capacities of a multi-hop relay with $M = 3$ hops under Pareto
distributed signal-to-interference ratio with parameters $\beta_X = 1$ and $\beta_Y = 3$, so that for each hop
$X_m \leq_c Y_m$, $m = 1, 2, 3$ is satisfied. It is observed from Fig. 2 that for $\rho < \rho_0$, where $\rho_0 \approx 5$ dB,
$X$ is a better channel than $Y$ in the capacity order, while for $\rho > \rho_0$, the situation is reversed.

C. Fading Multiple Access Channel 

In this example, we focus on comparing the ergodic capacity regions of a multi-user Gaussian
MAC network in two different fading scenarios. Consider the following system model:

$$y = \sqrt{\rho} \sum_{m=1}^M h_m s_m + v, \tag{23}$$

where $\rho$ is the average SNR of each user, $s_m$ is the transmitted symbol of user $m$, $h_m$ is the
complex i.i.d. (across time) ergodic fading between each user and the destination, and $v$ is the
AWGN at the receiver. It is assumed that only the receiver possesses CSI of all the users. The
receiver intends to decode the signals from all the users. If $X_m := |h_m|^2$, $m = 1, \ldots, M$, then
the ergodic capacity region is the set of all rate $M$-tuples $(R_1, \ldots, R_M)$ that satisfy [15, pp. 407]

$$\sum_{m \in S} R_m \leq E\left[\log\left(1 + \rho \sum_{m \in S} X_m\right)\right], \tag{24}$$

where $S$ ranges over $2^{\{1,\ldots,M\}}$, the set of all subsets of users. Let $g_{MAC,S}(X_1, \ldots, X_M) := \sum_{m \in S} X_m$.
Clearly, $g_{MAC,S}(X_1, \ldots, X_M) \in \mathcal{CTBF}_{\text{card}\,S}$. If $X_m \leq_c Y_m$, $m = 1, \ldots, M$, from Property S4 it follows that

$$g_{MAC,S}(X_1, \ldots, X_M) \leq_c g_{MAC,S}(Y_1, \ldots, Y_M), \quad \forall S \in 2^{\{1,\ldots,M\}}. \tag{25}$$

Thus, if each user of the system $Y$ has a higher ergodic capacity than the corresponding user
in the system $X$, then $\mathcal{C}^{(X)}_{MAC}(\rho) \subseteq \mathcal{C}^{(Y)}_{MAC}(\rho)$, $\forall \rho > 0$.

V. MIMO Ergodic Capacity Order 

In this section, the ergodic capacity ordering of MIMO systems is presented. Some properties
of this stochastic order are discussed, and an application of this framework in a MIMO MAC
setting is presented. Before doing so, we formally define a MIMO system through its single
letter characterization:

$$\mathbf{y} = \sqrt{\rho}\, \mathbf{H}\mathbf{s} + \mathbf{v}, \tag{26}$$

where $\mathbf{H}$ is a complex $N_R \times N_T$ random matrix which captures the effect of ergodic quasi-static
fading, $\mathbf{v} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ is the additive noise, $\mathbf{s}$ is the transmitted symbol vector, and $\rho$ is the
average SNR per transmit antenna. $\mathbf{H}$ and $\mathbf{v}$ are assumed to be i.i.d. across time, as a result of
which a time index has not been used in (26). Further, it is assumed that the receiver tracks the
channel fading realizations $\mathbf{H}$, while no such CSI is available at the transmitter. For this system
model, the ergodic capacity is given by $\overline{C}^{(\mathbf{X})}_{MIMO}(\rho) = E[\log\det(\mathbf{I} + \rho\mathbf{X})]$, where $\mathbf{X}$ denotes the
instantaneous fading power, which equals $\mathbf{H}^H\mathbf{H}$. In what follows, we define a partial order on
the instantaneous fading power, which can be used to compare the ergodic capacity of composite
MIMO systems under two different fading environments.
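
The log-determinant in the MIMO capacity expression can equivalently be computed by applying the scalar function $\log(1 + \rho\lambda)$ to the eigenvalues of $\mathbf{X}$ and summing, which is the matrix-function viewpoint of Section II-E. The sketch below checks this for one fixed realization, under the simplifying assumption of a real symmetric $2 \times 2$ matrix (the paper's matrices are complex Hermitian); the helper names and the numerical entries of $\mathbf{H}$ are arbitrary.

```python
import math

def eigvals_sym2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

def trace_log(a, b, d, rho):
    """tr log(I + rho*X), with log applied to X through its eigenvalues."""
    return sum(math.log1p(rho * lam) for lam in eigvals_sym2(a, b, d))

def log_det(a, b, d, rho):
    """log det(I + rho*X) computed directly from the 2x2 determinant."""
    return math.log((1 + rho * a) * (1 + rho * d) - (rho * b) ** 2)

# X = H^T H for an arbitrary real 2x2 H, so X is positive semidefinite.
h11, h12, h21, h22 = 0.8, -0.3, 0.5, 1.1
a = h11 * h11 + h21 * h21
b = h11 * h12 + h21 * h22
d = h12 * h12 + h22 * h22

for rho in (0.1, 1.0, 10.0):
    print(rho, trace_log(a, b, d, rho), log_det(a, b, d, rho))
```

Both columns should agree to floating-point precision for every $\rho$; averaging either quantity over channel realizations then gives the same ergodic capacity.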

A. Definition and Properties 

Definition 2. If two random matrices $\mathbf{X}, \mathbf{Y}$ belong to $\mathbb{S}_+^n$, then $\mathbf{X} \preceq_c \mathbf{Y}$ if and only if

$$E[\mathrm{tr}\log(\mathbf{I} + \rho\mathbf{X})] \le E[\mathrm{tr}\log(\mathbf{I} + \rho\mathbf{Y})], \quad \forall \rho > 0.$$

In Definition 2, $\log(\cdot)$ is to be viewed as a matrix function, in the sense of Section II-E. It is
easy to show that $\mathbf{X} \preceq_c \mathbf{Y}$ is equivalent to $E[\log\det(\mathbf{I} + \rho\mathbf{X})] \le E[\log\det(\mathbf{I} + \rho\mathbf{Y})]$, $\forall \rho > 0$.
In contrast to the ergodic capacity order on random variables, the MIMO ergodic capacities
corresponding to two different random matrices $\mathbf{X}$ and $\mathbf{Y}$ may be identical (for example, when
$\mathbf{Y} = \mathbf{U}\mathbf{X}\mathbf{U}^H$, where $\mathbf{U}$ is a unitary matrix). In this circumstance, we write $\mathbf{X} =_c \mathbf{Y}$. In what
follows, some properties of the MIMO ergodic capacity order are developed, which can be viewed
as matrix analogues to the properties developed in Section III-B. The following properties are
true for positive semi-definite symmetric random matrices.
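The two facts used in this paragraph, namely that the trace of the matrix logarithm equals the log-determinant and that a unitary similarity transform leaves it unchanged, can be verified on a single realization. The sketch below is illustrative only: the matrix size, the value of $\rho$, and the construction of the test matrices are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 4, 2.0

# A random positive semi-definite Hermitian matrix X = A^H A (illustrative choice).
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = a.conj().T @ a

# A unitary matrix U obtained from a QR factorisation, and Y = U X U^H.
u, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
y = u @ x @ u.conj().T


def tr_log(matrix, rho):
    """tr log(I + rho * matrix), with the logarithm applied to the eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(matrix)
    return float(np.sum(np.log1p(rho * eigenvalues)))


def log_det(matrix, rho):
    """log det(I + rho * matrix)."""
    return float(np.linalg.slogdet(np.eye(matrix.shape[0]) + rho * matrix)[1])


print(tr_log(x, rho), log_det(x, rho), log_det(y, rho))
```

All three printed values agree up to floating point error, since $\mathbf{X}$ and $\mathbf{U}\mathbf{X}\mathbf{U}^H$ share the same eigenvalues.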

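M1: If $\mathbf{X}, \mathbf{Y} \in \mathbb{S}_+^n$, then $\mathbf{X} \preceq_c \mathbf{Y} \Leftrightarrow E[\mathrm{tr}\, g(\mathbf{X})] \le E[\mathrm{tr}\, g(\mathbf{Y})]$, for all $g: \mathbb{R} \to \mathbb{R}$, such that
$g \in \mathcal{TBF}$.

M2: If $\mathbf{X}, \mathbf{Y} \in \mathbb{S}_+^n$, then $\mathbf{X} \preceq_c \mathbf{Y} \Leftrightarrow g(\mathbf{X}) \preceq_c g(\mathbf{Y})$, for all $g: \mathbb{R} \to \mathbb{R}$, such that $g \in \mathcal{CTBF}$.

M3: If $\mathbf{X}, \mathbf{Y} \in \mathbb{S}_+^n$ and $E[\mathrm{tr}\exp(-\rho\mathbf{X})] \ge E[\mathrm{tr}\exp(-\rho\mathbf{Y})]$, $\forall \rho > 0$, then $\mathbf{X} \preceq_c \mathbf{Y}$.

M4: Let $\mathbf{X}_{1:M} := [\mathbf{X}_1, \ldots, \mathbf{X}_M]$, $\mathbf{Y}_{1:M} := [\mathbf{Y}_1, \ldots, \mathbf{Y}_M]$, where $\mathbf{X}_i \in \mathbb{S}_+^n$ are independent
random matrices, for $i = 1, \ldots, M$, and likewise for $\mathbf{Y}_i$. Let $g(\mathbf{X}_{1:M}) := g(\mathbf{X}_1, \ldots, \mathbf{X}_M)$,
i.e., $g$ operates on $M$ matrices from $\mathbb{S}_+^n$ and produces a matrix in $\mathbb{S}_+^n$. If $g: \mathbb{R}^M \to \mathbb{R}$ is such that
$g \in \mathcal{CTBF}_M$, and $\mathbf{X}_i \preceq_c \mathbf{Y}_i$ for $i = 1, \ldots, M$, then $g(\mathbf{X}_{1:M}) \preceq_c g(\mathbf{Y}_{1:M})$.

M5: If $\mathbf{X} \preceq_c \mathbf{Y}$, and $\mathbf{Y} \preceq_c \mathbf{Z}$, then $\mathbf{X} \preceq_c \mathbf{Z}$.

M6: $\mathbf{X} =_c \mathbf{Y}$ if and only if $\sum_{i=1}^n F_{\lambda_i(\mathbf{X})}(u) = \sum_{i=1}^n F_{\lambda_i(\mathbf{Y})}(u)$, where $F_{\lambda_i(\mathbf{X})}(\cdot)$ is the marginal
CDF of the $i$-th largest eigenvalue of $\mathbf{X}$.

The proofs of properties M1-M4, and M6 can be found in Appendix B, while Property M5 is
straightforward to establish, and its proof is omitted. Property M3 provides a useful sufficient
condition to verify if two random matrices obey the MIMO ergodic capacity order. This is because
$E[\mathrm{tr}\exp(-\rho\mathbf{X})] \ge E[\mathrm{tr}\exp(-\rho\mathbf{Y})]$, $\forall \rho > 0$, is equivalent to $\sum_{i=1}^n E[\exp(-\rho\lambda_i(\mathbf{X}))] \ge
\sum_{i=1}^n E[\exp(-\rho\lambda_i(\mathbf{Y}))]$, $\forall \rho > 0$, and Laplace transforms of the eigenvalue distributions are
easier to compute than the expectations of the log-determinants. Next, we
form an interesting interpretation of Property M6. From Property M6, it follows that $\mathbf{X} =_c \mathbf{Y}$
if and only if $E_Q[F_{\lambda_Q(\mathbf{X})}(u)] = E_Q[F_{\lambda_Q(\mathbf{Y})}(u)]$, where $Q$ is a uniformly distributed integer over
$[1, n]$. In other words, if the distribution of an eigenvalue picked randomly and uniformly from
both matrices is identical, then the two random matrices are regarded to be the same with respect
to the MIMO ergodic capacity order.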

Although the proposed definition of the MIMO ergodic capacity order is one of many different 
possible partial orders on matrices, we assert that it is a natural generalization of the ergodic 
capacity order defined in Section III. This is also elucidated by the fact that the properties M1-M5
are indeed straightforward matrix generalizations of properties S1-S5. Further, the MIMO
ergodic capacity order bears the following connection with the ergodic capacity order defined 
for random variables: 

Theorem 1. Let $\lambda_Q(\mathbf{X}) \le_c \lambda_Q(\mathbf{Y})$, where $\lambda_Q(\mathbf{X})$ is an eigenvalue of $\mathbf{X}$ picked uniformly from
the set of eigenvalues of $\mathbf{X}$. Then $\mathbf{X} \preceq_c \mathbf{Y}$. Conversely, if $\mathbf{X} \preceq_c \mathbf{Y}$, then $\lambda_Q(\mathbf{X}) \le_c \lambda_Q(\mathbf{Y})$.

Proof: See Appendix C. ■

Given two MIMO fading systems $\mathbf{X}$ and $\mathbf{Y}$, Theorem 1 implies that $\mathbf{Y}$ dominates $\mathbf{X}$ in the
MIMO ergodic capacity order if and only if a uniformly randomly selected eigen-channel of $\mathbf{Y}$
has a larger ergodic capacity than a uniformly randomly selected eigen-channel of $\mathbf{X}$.
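The link between the log-determinant and a randomly picked eigen-channel rests on the identity $E[\log\det(\mathbf{I} + \rho\mathbf{X})] = n\,E[\log(1 + \rho\lambda_Q(\mathbf{X}))]$. The sketch below checks this identity by simulation. The Wishart-type model $\mathbf{X} = \mathbf{H}^H\mathbf{H}$ with i.i.d. $\mathcal{CN}(0, 1)$ entries, the dimensions and the sample size are assumptions made for the example only.

```python
import numpy as np

rng = np.random.default_rng(2)
num_samples, n_r, n_t, rho = 50_000, 4, 3, 1.0

# Assumption: X = H^H H with i.i.d. CN(0, 1) entries in H, used purely as an example.
shape = (num_samples, n_r, n_t)
h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)
x = np.conj(np.swapaxes(h, -1, -2)) @ h
eigenvalues = np.clip(np.linalg.eigvalsh(x), 0.0, None)  # shape (num_samples, n_t)

# E[log det(I + rho X)] written as the expected sum over all eigen-channels.
log_det_mean = float(np.mean(np.sum(np.log1p(rho * eigenvalues), axis=1)))

# lambda_Q(X): one eigenvalue per realisation, with index Q uniform over the n_t eigenvalues.
q = rng.integers(0, n_t, size=num_samples)
lambda_q = eigenvalues[np.arange(num_samples), q]
picked_mean = float(np.mean(np.log1p(rho * lambda_q)))

print(log_det_mean, n_t * picked_mean)
```

The two printed numbers agree up to Monte Carlo error, which is why ordering the randomly picked eigenvalue in the scalar sense is the same as ordering the matrices in the MIMO sense.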

An illustrative example of the efficacy of the MIMO ergodic capacity order is the
$M$ user Gaussian MIMO MAC, where user $i$ possesses $N_t$ antennas. We assume that only the
receiver has CSI, and that each antenna of each user transmits independent signals. Further,
each user is allocated the same transmit power $\rho$ per transmit antenna. In this case, the ergodic
capacity region $\mathcal{C}^{\mathbf{X}}_{\mathrm{MIMO\text{-}MAC}}(\rho)$ is given by [16]:

$$\mathcal{C}^{\mathbf{X}}_{\mathrm{MIMO\text{-}MAC}}(\rho) := \left\{ (R_1, \ldots, R_M) : \sum_{i \in S} R_i \le E\left[\log\det\left(\mathbf{I} + \rho\, g_{\mathrm{MIMO\text{-}MAC},S}(\mathbf{X}_{1:M})\right)\right], \ \forall S \subseteq \{1, \ldots, M\} \right\}, \qquad (27)$$

where $g_{\mathrm{MIMO\text{-}MAC},S}(\mathbf{X}_{1:M}) := \sum_{i \in S} \mathbf{X}_i$, with $S \subseteq \{1, \ldots, M\}$. Clearly, when $\mathbf{X}_i$ is assumed to be
the variable while viewing all other arguments of $g_{\mathrm{MIMO\text{-}MAC},S}(\cdot)$ as constant matrices, it can
be seen that $g_{\mathrm{MIMO\text{-}MAC},S}(\cdot)$ is a Thorin-Bernstein matrix function of $\mathbf{X}_i$, for $i = 1, \ldots, M$.
Therefore, through Property M4, $g_{\mathrm{MIMO\text{-}MAC},S}(\mathbf{X}_{1:M}) \preceq_c g_{\mathrm{MIMO\text{-}MAC},S}(\mathbf{Y}_{1:M})$, whenever $\mathbf{X}_i \preceq_c
\mathbf{Y}_i$, $i = 1, \ldots, M$. Consequently, $\mathcal{C}^{\mathbf{X}}_{\mathrm{MIMO\text{-}MAC}}(\rho) \subseteq \mathcal{C}^{\mathbf{Y}}_{\mathrm{MIMO\text{-}MAC}}(\rho)$, $\forall \rho > 0$.

VI. Conclusion 

The ergodic capacity order and its properties can be exploited to obtain comparisons of ergodic 
capacities of composite systems across two different fading channels whose instantaneous SNRs 
satisfy the ergodic capacity order. For systems such as MRC and EGC which involve multiple 
instantaneous SNR RVs, we conclude that combining a better set of channels (in the ergodic 
capacity order) produces a system with a higher ergodic capacity. This conclusion is true for all 
systems whose end-to-end instantaneous SNR belongs to the $\mathcal{CTBF}_m$ set. For systems whose
end-to-end SNR does not belong to $\mathcal{CTBF}_m$, component-wise ergodic capacity ordering of
instantaneous SNR need not produce a system with a higher ergodic capacity. An example to 
illustrate this point is the AF relay for which the instantaneous SINR is Pareto distributed. An 
extension of the ergodic capacity order to MIMO systems is also proposed herein. The properties 
of the ergodic capacity order can be used to compare the capacity regions of systems such as the 
multi-user MAC in two different fading environments, for both the single and multiple antenna 
case. 

Appendix A 

Proofs: Properties of Ergodic Capacity Order 
Proof of Property S1

Let $X$ and $Y$ be non-negative RVs, and $g \in \mathcal{TBF}$. Using (6), we can write

$$E[g(X)] = E\left[a + bX + \int_0^\infty \log\left(1 + \frac{X}{s}\right)\mu(ds)\right] \qquad (28)$$

for some $a, b \ge 0$ and $\mu(s) \ge 0$, which satisfies the convergence conditions following (6).
Commuting the expectation and integral in (28), we get

$$E[g(X)] = a + bE[X] + \int_0^\infty E\left[\log\left(1 + \frac{X}{s}\right)\right]\mu(ds) \qquad (29)$$

$$\le a + bE[Y] + \int_0^\infty E\left[\log\left(1 + \frac{Y}{s}\right)\right]\mu(ds) \qquad (30)$$

$$= E[g(Y)],$$

where (30) follows from (29), the definition of $\le_c$ applied with $\rho = 1/s$ inside the integral, and the
fact that $X \le_c Y \Rightarrow E[X] \le E[Y]$. To prove $X \le_c Y \Rightarrow E[X] \le E[Y]$, note that for $\rho > 0$, we have
$(1/\rho)E[\log(1 + \rho X)] \le (1/\rho)E[\log(1 + \rho Y)]$. Now,
moving the $1/\rho$ term inside the expectation on both sides, we take the limit as $\rho \to 0$. The limit
and expectation can be commuted, since $(1/\rho)\log(1 + \rho X) \le X$, $\forall \rho > 0$, and consequently,
the dominated convergence criterion is satisfied. This result can also be shown to hold for the
case when $E[Y] = \infty$, by using the relation $E[\log(1 + Y)] \le E[Y]$.

To prove the converse, we begin with $E[g(X)] \le E[g(Y)]$ for $g \in \mathcal{TBF}$. Choosing $g(x) =
\log(1 + \rho x)$, which is in $\mathcal{TBF}$ for every $\rho > 0$, we get $X \le_c Y$. This concludes the proof.
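The limiting step in this proof, $(1/\rho)E[\log(1 + \rho X)] \to E[X]$ as $\rho \to 0$ together with the bound $(1/\rho)\log(1 + \rho X) \le X$, can be observed numerically. The sketch below assumes a unit-mean exponential fading power, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=100_000)  # assumed example: unit-mean exponential power
mean_x = float(np.mean(x))

for rho in (1.0, 0.1, 0.01, 0.001):
    scaled = float(np.mean(np.log1p(rho * x)) / rho)  # (1/rho) * E[log(1 + rho X)]
    print(f"rho={rho}: {scaled:.5f}  (sample mean of X = {mean_x:.5f})")
```

The scaled Shannon transform increases towards the sample mean as $\rho$ shrinks and never exceeds it, which is the domination used to exchange the limit and the expectation.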

Proof of Property S2 

Let $X$, $Y$ be non-negative RVs, and $\phi: \mathbb{R}^+ \to \mathbb{R}^+ \in \mathcal{TBF}$. Since $\phi \circ g \in \mathcal{TBF}$ if and only
if $g \in \mathcal{CTBF}$, from Property S1, if $X \le_c Y$, we have $E[\phi(g(X))] \le E[\phi(g(Y))]$. Since $\phi \in
\mathcal{TBF}$ is arbitrary, this implies $g(X) \le_c g(Y)$.

To prove the converse, consider $g(X) \le_c g(Y)$ for $g \in \mathcal{CTBF}$. Choosing $g(x) = x$, we get
$X \le_c Y$, which completes the proof.

Proof of Property S3 

Recall from Section II-F1 that $X \le_{Lt} Y \Leftrightarrow E[g(X)] \le E[g(Y)]$ for all $g \in \mathcal{BF}$. The property
then follows by observing that $g(x) = \log(1 + \rho x)$ is a Bernstein function for every $\rho > 0$.

Proof of Property S4 

This property is proved using mathematical induction. To begin with, let $\phi: \mathbb{R}^+ \to \mathbb{R}^+ \in \mathcal{TBF}$,
and let $\mathbf{x}_{1:m} := [X_1, \ldots, X_m]$ have independent and non-negative RVs as components. Assume
likewise for $\mathbf{y}_{1:m} := [Y_1, \ldots, Y_m]$. Now, for $m = 1$, Property S4 is true due to Property S2.
Next, let us assume Property S4 to be true for vectors of length $m - 1$. Thus, for any $g \in
\mathcal{CTBF}_m$, it is straightforward to see that $g([a\ \mathbf{x}_{1:m-1}]) \le_c g([a\ \mathbf{y}_{1:m-1}])$, where $g([a\ \mathbf{x}_{1:m-1}]) :=
g(a, X_1, \ldots, X_{m-1})$, and $a \in \mathbb{R}^+$. This implies

$$E[\phi(g([a\ \mathbf{x}_{1:m-1}]))] \le E[\phi(g([a\ \mathbf{y}_{1:m-1}]))]. \qquad (31)$$

Next, for vectors of length $m$, consider

$$E[\phi(g(\mathbf{x}_{1:m})) \mid X_1 = x] = E[\phi(g([x\ \mathbf{x}_{2:m}]))] \qquad (32)$$

$$\le E[\phi(g([x\ \mathbf{y}_{2:m}]))] = E[\phi(g(\mathbf{y}_{1:m})) \mid Y_1 = x], \qquad (33)$$

where (33) follows from (32) due to (31). Now, taking the expectation with respect to $X_1$ on
the left hand side of (32) and the right hand side of (33), we get $E[\phi(g(X_1, \ldots, X_m))] \le
E[\phi(g(Y_1, \ldots, Y_m))]$. Equations (32) and (33) can be repeated by conditioning on all the other
RVs to complete the proof.

Proof of Property S5 

Assume $X \le_c Y$, and $Y \le_c Z$. By definition, this implies $E[\log(1 + \rho X)] \le E[\log(1 + \rho Y)]$, $\forall \rho >
0$, and $E[\log(1 + \rho Y)] \le E[\log(1 + \rho Z)]$, $\forall \rho > 0$. Therefore, $E[\log(1 + \rho X)] \le E[\log(1 + \rho Z)]$, $\forall \rho >
0$, which implies $X \le_c Z$.

Proof of Property S6 

To establish Property S6, it is sufficient to show that the Shannon transform of a RV has a
one-to-one correspondence with the distribution of the RV. To this end, recall from (12) that
for any non-negative RV $X$, the Shannon transform $E[\log(1 + \rho X)]$ can be written as the Stieltjes transform of $1 - F_X(\cdot)$.
Since the Stieltjes transform of a function with bounded variation is unique [12, p. 336], and
$1 - F_X(\cdot)$ is of bounded variation, the property follows.
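The Stieltjes representation invoked here can be checked numerically. The sketch below assumes that (12) takes the form $E[\log(1 + \rho X)] = \int_0^\infty \frac{1 - F_X(u)}{\rho^{-1} + u}\,du$, which follows from integration by parts, and uses a unit-mean exponential RV as an illustrative example. The helper `trapezoid` is a hand-written quadrature rule, not a library call.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


u = np.linspace(0.0, 60.0, 600_001)  # the tail beyond 60 is negligible for exp(-u)
survival = np.exp(-u)                # 1 - F_X(u) for the assumed X ~ Exp(1)
density = np.exp(-u)                 # f_X(u) for the same X

for rho in (0.5, 1.0, 5.0):
    shannon = trapezoid(np.log1p(rho * u) * density, u)    # E[log(1 + rho X)]
    stieltjes = trapezoid(survival / (1.0 / rho + u), u)   # integral of (1 - F_X(u)) / (1/rho + u)
    print(f"rho={rho}: {shannon:.6f} vs {stieltjes:.6f}")
```

Both columns coincide for every $\rho$, so knowing the Shannon transform at all SNRs amounts to knowing the Stieltjes transform of the survival function.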



February 22, 2013 



DRAFT 






Appendix B 

Proofs: Properties of MIMO Ergodic Capacity Order 
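Proof of Property M1

Assume $\mathbf{X} \preceq_c \mathbf{Y}$. Using the identity $\det(\mathbf{I} + \rho\mathbf{X}) = \prod_{i=1}^n (1 + \rho\lambda_i(\mathbf{X}))$, we can write

$$\mathbf{X} \preceq_c \mathbf{Y} \Leftrightarrow E\left[\sum_{i=1}^n \log(1 + \rho\lambda_i(\mathbf{X}))\right] \le E\left[\sum_{i=1}^n \log(1 + \rho\lambda_i(\mathbf{Y}))\right], \qquad (34)$$

$\forall \rho > 0$. It now follows that

$$\mathbf{X} \preceq_c \mathbf{Y} \Rightarrow E\left[\sum_{i=1}^n \log(1 + t\rho\lambda_i(\mathbf{X}))\,\mu(t)\right] \le E\left[\sum_{i=1}^n \log(1 + t\rho\lambda_i(\mathbf{Y}))\,\mu(t)\right], \qquad (35)$$

for all $\mu(t) \ge 0$, $\rho > 0$ and $t \ge 0$. Integrating both sides of (35) over $t$ in the interval
$[0, \infty)$ preserves the inequality in (35). Therefore,

$$\mathbf{X} \preceq_c \mathbf{Y} \Rightarrow E\left[\sum_{i=1}^n \int_0^\infty \log(1 + t\rho\lambda_i(\mathbf{X}))\,\mu(dt)\right] \le E\left[\sum_{i=1}^n \int_0^\infty \log(1 + t\rho\lambda_i(\mathbf{Y}))\,\mu(dt)\right], \qquad (36)$$

$\forall \rho > 0$. The summand in (36) is an arbitrary Thorin-Bernstein function, since $\mu$ is an arbitrary
non-negative function. Denoting this Thorin-Bernstein function by $g$, the direct part of the
property is proved by observing from Section II-E that $E\left[\sum_{i=1}^n g(\lambda_i(\mathbf{X}))\right] = E[\mathrm{tr}\, g(\mathbf{X})]$. To
prove the converse, choose $g(\mathbf{A}) = \log(\mathbf{I} + \rho\mathbf{A})$.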

Proof of Property M2 

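Let $\mathbf{X}, \mathbf{Y} \in \mathbb{S}_+^n$, and $\mathbf{X} \preceq_c \mathbf{Y}$. Let $\phi: \mathbb{R} \to \mathbb{R}$ belong to $\mathcal{TBF}$, and $g: \mathbb{R} \to \mathbb{R}$ belong
to $\mathcal{CTBF}$. Using the definition of matrix functions, it is easy to see that $f(\mathbf{X}) := \phi(g(\mathbf{X}))$ is a
$\mathcal{TBF}$ matrix function. From Property M1, it is seen that $\mathbf{X} \preceq_c \mathbf{Y} \Rightarrow E[\mathrm{tr}\,\phi(g(\mathbf{X}))] \le E[\mathrm{tr}\,\phi(g(\mathbf{Y}))]$. In other
words, $g(\mathbf{X}) \preceq_c g(\mathbf{Y})$, which proves the direct part of the property. To see the converse, choose
$g$ as the identity map.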

Proof of Property M3 

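Let $\mathbf{X}, \mathbf{Y} \in \mathbb{S}_+^n$, and $\mathbf{X} \preceq_c \mathbf{Y}$. Using Frullani's formula (5), it is evident that an equivalent
condition to $\mathbf{X} \preceq_c \mathbf{Y}$ is given by

$$\mathbf{X} \preceq_c \mathbf{Y} \Leftrightarrow E\left[\int_0^\infty \frac{e^{-s}}{s} \sum_{i=1}^n \left(1 - \exp(-\rho s\lambda_i(\mathbf{X}))\right) ds\right] \le E\left[\int_0^\infty \frac{e^{-s}}{s} \sum_{i=1}^n \left(1 - \exp(-\rho s\lambda_i(\mathbf{Y}))\right) ds\right]. \qquad (37)$$

Commuting the expectation and integral in (37), we get

$$\int_0^\infty \frac{e^{-s}}{s}\, E\left[\sum_{i=1}^n \exp(-\rho s\lambda_i(\mathbf{X}))\right] ds \ge \int_0^\infty \frac{e^{-s}}{s}\, E\left[\sum_{i=1}^n \exp(-\rho s\lambda_i(\mathbf{Y}))\right] ds. \qquad (38)$$

Therefore, if $E\left[\sum_{i=1}^n \exp(-\rho\lambda_i(\mathbf{X}))\right] \ge E\left[\sum_{i=1}^n \exp(-\rho\lambda_i(\mathbf{Y}))\right]$, $\forall \rho > 0$, then $\mathbf{X} \preceq_c \mathbf{Y}$. The
property then follows through the observation that $E\left[\sum_{i=1}^n \exp(-\rho\lambda_i(\mathbf{X}))\right] = E[\mathrm{tr}\exp(-\rho\mathbf{X})]$.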
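The proof relies on writing the logarithm as an integral of exponentials. Assuming that Frullani's formula (5) takes the standard form $\log(1 + x) = \int_0^\infty \frac{e^{-s}}{s}\left(1 - e^{-sx}\right) ds$, the sketch below checks the identity by numerical quadrature. The function names, the truncation point and the grid size are illustrative choices.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def frullani_log1p(x, upper=60.0, num_points=600_001):
    """Approximate the integral of (exp(-s) / s) * (1 - exp(-s * x)) over s in (0, inf)."""
    s = np.linspace(1e-12, upper, num_points)  # start just above 0 to avoid 0/0
    integrand = np.exp(-s) * (-np.expm1(-s * x)) / s
    return trapezoid(integrand, s)


for x in (0.5, 3.0, 20.0):
    print(f"x={x}: integral={frullani_log1p(x):.6f}, log(1+x)={np.log1p(x):.6f}")
```

Because the kernel $e^{-s}/s$ is non-negative, an ordering of the Laplace transforms at every argument carries over to the Shannon transforms, which is the content of Property M3.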



Proof of Property M4 

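This property is proved using mathematical induction. To begin with, choose a matrix function
$\phi \in \mathcal{TBF}$, and let $\mathbf{X}_{1:m} := [\mathbf{X}_1, \ldots, \mathbf{X}_m]$ have independent and non-negative random matrices as
components. Assume likewise for $\mathbf{Y}_{1:m} := [\mathbf{Y}_1, \ldots, \mathbf{Y}_m]$. Now, for $m = 1$, Property M4 is true
due to Property M1. Next, let us assume Property M4 to be true for sequences of length $m - 1$.
Thus, for $g \in \mathcal{CTBF}_m$ we have $g([\mathbf{C}\ \mathbf{X}_{1:m-1}]) \preceq_c g([\mathbf{C}\ \mathbf{Y}_{1:m-1}])$, where $g([\mathbf{C}\ \mathbf{X}_{1:m-1}]) :=
g(\mathbf{C}, \mathbf{X}_1, \ldots, \mathbf{X}_{m-1})$, and $\mathbf{C} \in \mathbb{S}_+^n$. This implies

$$E[\mathrm{tr}\,\phi(g([\mathbf{C}\ \mathbf{X}_{1:m-1}]))] \le E[\mathrm{tr}\,\phi(g([\mathbf{C}\ \mathbf{Y}_{1:m-1}]))]. \qquad (39)$$

Next, for sequences of length $m$, consider

$$E[\mathrm{tr}\,\phi(g(\mathbf{X}_{1:m})) \mid \mathbf{X}_1 = \mathbf{A}] = E[\mathrm{tr}\,\phi(g([\mathbf{A}\ \mathbf{X}_{2:m}]))] \qquad (40)$$

$$\le E[\mathrm{tr}\,\phi(g([\mathbf{A}\ \mathbf{Y}_{2:m}]))] = E[\mathrm{tr}\,\phi(g(\mathbf{Y}_{1:m})) \mid \mathbf{Y}_1 = \mathbf{A}], \qquad (41)$$

where (41) follows from (40) due to (39). Now, taking the expectation with respect to $\mathbf{X}_1$ on
the left hand side of (40) and the right hand side of (41), we get $E[\mathrm{tr}\,\phi(g(\mathbf{X}_1, \ldots, \mathbf{X}_m))] \le
E[\mathrm{tr}\,\phi(g(\mathbf{Y}_1, \ldots, \mathbf{Y}_m))]$. Equations (40) and (41) can be repeated by conditioning on all the
other random matrices to complete the proof.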

Proof of Property M6 

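To prove this property, let $\mathbf{X}, \mathbf{Y} \in \mathbb{S}_+^n$, and $E[\log\det(\mathbf{I} + \rho\mathbf{X})] = E[\log\det(\mathbf{I} + \rho\mathbf{Y})]$. Using
the representation of the log-determinant in terms of the eigenvalues, and (12), it is seen that

$$E[\log\det(\mathbf{I} + \rho\mathbf{X})] = E[\log\det(\mathbf{I} + \rho\mathbf{Y})] \Leftrightarrow \int_0^\infty \frac{\sum_{i=1}^n \left(1 - F_{\lambda_i(\mathbf{X})}(u)\right)}{\rho^{-1} + u}\,du = \int_0^\infty \frac{\sum_{i=1}^n \left(1 - F_{\lambda_i(\mathbf{Y})}(u)\right)}{\rho^{-1} + u}\,du. \qquad (42)$$

To see the direct part of the property, recall that the Stieltjes transform of a function of bounded
variation is in a one-to-one correspondence with the function, and $\sum_{i=1}^n \left(1 - F_{\lambda_i(\mathbf{X})}(u)\right)$ is of
bounded variation. It is therefore immediate that if $E[\log\det(\mathbf{I} + \rho\mathbf{X})] = E[\log\det(\mathbf{I} + \rho\mathbf{Y})]$,
then $\sum_{i=1}^n F_{\lambda_i(\mathbf{X})}(u) = \sum_{i=1}^n F_{\lambda_i(\mathbf{Y})}(u)$, a.e. To prove the converse, assume $\sum_{i=1}^n F_{\lambda_i(\mathbf{X})}(u) =
\sum_{i=1}^n F_{\lambda_i(\mathbf{Y})}(u)$ a.e. Then according to (42), $E[\log\det(\mathbf{I} + \rho\mathbf{X})] = E[\log\det(\mathbf{I} + \rho\mathbf{Y})]$.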

Appendix C 
Proof of Theorem 1 

We begin by assuming $\mathbf{X} \preceq_c \mathbf{Y}$. By definition, $E[\log\det(\mathbf{I} + \rho\mathbf{X})] \le E[\log\det(\mathbf{I} + \rho\mathbf{Y})]$, $\forall \rho >
0$. Using the representation of the log-determinant in terms of the eigenvalues, we get

$$\mathbf{X} \preceq_c \mathbf{Y} \Leftrightarrow \int_0^\infty \log(1 + \rho x) \sum_{i=1}^n f_{\lambda_i(\mathbf{X})}(x)\,dx \le \int_0^\infty \log(1 + \rho x) \sum_{i=1}^n f_{\lambda_i(\mathbf{Y})}(x)\,dx. \qquad (43)$$

Now, consider a RV $\lambda_Q(\mathbf{X})$, which is uniformly picked from the set of all eigenvalues of $\mathbf{X}$.
Clearly, its density is given by $n^{-1}\sum_{i=1}^n f_{\lambda_i(\mathbf{X})}(x)$. Therefore, (43) can be rewritten as

$$\mathbf{X} \preceq_c \mathbf{Y} \Leftrightarrow \int_0^\infty \log(1 + \rho x)\, f_{\lambda_Q(\mathbf{X})}(x)\,dx \le \int_0^\infty \log(1 + \rho x)\, f_{\lambda_Q(\mathbf{Y})}(x)\,dx, \qquad (44)$$

which proves the direct part of the Theorem. The converse can be proved by assuming $\lambda_Q(\mathbf{X}) \le_c
\lambda_Q(\mathbf{Y})$, and retracing the above steps.

Fig. 1. M-hop relay. S represents the source, $R_m$ represent the relays and D represents the destination.

Fig. 2. Ergodic capacity of amplify-forward relay with M = 3 slots. The instantaneous SINR is Pareto distributed with
parameters $\beta_X = 1$ (dashed line) and $\beta_Y = 3$ (solid line).




Fig. 3. M-user multiple access channel.